Trade&Ahead

Context

The stock market has consistently proven to be a good place to invest and save for the future. There are many compelling reasons to invest in stocks: doing so can help fight inflation, create wealth, and provide some tax benefits. Steady returns on investments over a long period can also grow far more than seems possible. Thanks to the power of compounding, the earlier one starts investing, the larger the corpus one can build for retirement. Overall, investing in stocks can help meet life's financial aspirations.

It is important to maintain a diversified portfolio when investing in stocks in order to protect earnings across market conditions. A diversified portfolio tends to yield steadier returns and face lower risk because it tempers potential losses when the market is down. It is easy to get lost in a sea of financial metrics while determining the worth of a single stock, and repeating that analysis for a multitude of stocks to identify the right picks for an individual can be tedious. Cluster analysis helps here: it identifies groups of stocks that exhibit similar characteristics, as well as stocks that are minimally correlated with one another. This helps investors analyze stocks across different market segments and protect the portfolio against risks that could make it vulnerable to losses.

Objective

Trade&Ahead is a financial consultancy firm that provides its customers with personalized investment strategies. The firm has data comprising the stock price and some financial indicators for a few companies listed on the New York Stock Exchange. The task is to analyze the data, group the stocks based on the attributes provided, and share insights about the characteristics of each group.

Data Description

Importing necessary libraries and data

Check for any missing data
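A minimal sketch of the missing-data check, assuming the data is held in a pandas DataFrame. The toy frame and its column names (`ticker`, `sector`, `price`) are hypothetical stand-ins, not the actual dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the stock dataset; column names and values are hypothetical.
df = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "CCC", "DDD"],
        "sector": ["Energy", "Financials", None, "Energy"],
        "price": [10.5, np.nan, 33.0, 47.2],
    }
)

# Count missing values per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Total number of missing cells in the whole frame.
total_missing = int(missing_per_column.sum())
print("Total missing values:", total_missing)
```

If the total is zero, no imputation step is needed before clustering.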

Let's check for duplicate data.
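One possible way to run the duplicate check with pandas. The toy frame is a hypothetical stand-in for the real data, built with one repeated row so the check has something to find.

```python
import pandas as pd

# Hypothetical toy data with one fully repeated row.
df = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "BBB", "CCC"],
        "price": [10.5, 20.0, 20.0, 33.0],
    }
)

# duplicated() flags every row that repeats an earlier row exactly.
n_duplicates = int(df.duplicated().sum())
print("Duplicate rows:", n_duplicates)

# Drop the repeats and rebuild a clean 0..n-1 index.
deduplicated = df.drop_duplicates().reset_index(drop=True)
print(deduplicated)
```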

Let's check the data types of the columns.

Let's check the unique values of the object columns.
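A short sketch of both checks, the column data types and the unique values of the text columns. The frame and its column names are hypothetical. Text columns are selected by excluding numeric types, which is an assumption made here so the selection does not depend on how a given pandas version labels string columns.

```python
import pandas as pd

# Hypothetical toy data; the real dataset's columns are not shown in this outline.
df = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "CCC", "DDD"],
        "sector": ["Energy", "Financials", "Energy", "Health Care"],
        "price": [10.5, 20.0, 33.0, 47.2],
        "volume": [100, 250, 75, 310],
    }
)

# Data type of every column.
print(df.dtypes)

# Non-numeric columns are treated as the text ("object") columns here.
text_columns = df.select_dtypes(exclude="number").columns

# Number of distinct values, then the frequency of each value, per text column.
unique_counts = {col: df[col].nunique() for col in text_columns}
for col in text_columns:
    print(col, unique_counts[col], df[col].value_counts().to_dict())
```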

Exploratory Data Analysis (EDA)

Univariate Analysis

Functions for plotting

Bivariate Analysis

Questions:

  1. What does the distribution of stock prices look like?
  2. The stocks of which economic sector have seen the maximum price increase on average?
  3. How are the different variables correlated with each other?
  4. Cash ratio provides a measure of a company's ability to cover its short-term obligations using only cash and cash equivalents. How does the average cash ratio vary across economic sectors?
  5. P/E ratios can help determine the relative value of a company's shares as they signify the amount of money an investor is willing to invest in a single share of a company per dollar of its earnings. How does the P/E ratio vary, on average, across economic sectors?
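The sector-level questions above can be approached with a group-by average, and the correlation question with a correlation matrix. The sketch below assumes a pandas DataFrame with hypothetical columns `sector`, `price_change`, `cash_ratio`, and `pe_ratio`. The numbers are made up for illustration and say nothing about the real data.

```python
import pandas as pd

# Hypothetical toy data; values are illustrative only.
df = pd.DataFrame(
    {
        "sector": ["Energy", "Energy", "Financials", "Financials", "Health Care"],
        "price_change": [4.0, -2.0, 1.0, 3.0, 6.0],
        "cash_ratio": [50, 70, 100, 80, 120],
        "pe_ratio": [12.0, 18.0, 10.0, 14.0, 30.0],
    }
)

numeric_columns = ["price_change", "cash_ratio", "pe_ratio"]

# Average of each metric within each economic sector.
sector_means = df.groupby("sector")[numeric_columns].mean()
print(sector_means)

# Sector with the largest average price increase.
top_sector = sector_means["price_change"].idxmax()
print("Largest average price change:", top_sector)

# Pairwise Pearson correlations between the numeric variables.
correlations = df[numeric_columns].corr()
print(correlations)
```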

Data Preprocessing

EDA

K-means Clustering

Let's check the silhouette scores.
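A minimal sketch of a silhouette-score sweep with scikit-learn. It runs on synthetic blobs rather than the stock data, and the range of cluster counts (2 to 10) and the random seed are assumptions chosen for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled numeric stock features.
X, _ = make_blobs(n_samples=300, centers=4, n_features=5, cluster_std=0.8, random_state=1)
X_scaled = StandardScaler().fit_transform(X)

# Fit K-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 11):
    model = KMeans(n_clusters=k, n_init=10, random_state=1)
    labels = model.fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)
    print(f"k = {k}: silhouette score = {scores[k]:.3f}")

# Candidate k with the highest score.
best_k = max(scores, key=scores.get)
print("Highest silhouette score at k =", best_k)
```

The silhouette score lies between -1 and 1, and higher values indicate tighter, better-separated clusters. It is usually read together with the cluster sizes rather than on its own.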

Let's take 8 as the appropriate number of clusters, since the silhouette score is high and a visual inspection of the clusters at 8 shows an even distribution.
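As a rough illustration of fitting the final model and checking how evenly the points are spread, the sketch below fits K-means with 8 clusters on synthetic data. The feature names `f1` to `f4`, the sample size, and the seed are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the numeric stock features.
X, _ = make_blobs(n_samples=340, centers=8, n_features=4, random_state=1)
features = pd.DataFrame(X, columns=["f1", "f2", "f3", "f4"])
scaled = StandardScaler().fit_transform(features)

# Fit K-means with the chosen number of clusters and store the labels.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=1)
features["cluster"] = kmeans.fit_predict(scaled)

# Number of observations assigned to each cluster.
cluster_sizes = features["cluster"].value_counts().sort_index()
print(cluster_sizes)
```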

Cluster Profiling

Insights

Hierarchical Clustering

Let's explore different linkage methods with Euclidean distance only.
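One way to compare linkage methods under Euclidean distance is the cophenetic correlation, computed here with SciPy. The data is synthetic, and the list of linkage methods is an assumption about which ones are worth comparing.

```python
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled stock features.
X, _ = make_blobs(n_samples=100, centers=3, n_features=4, random_state=1)
X_scaled = StandardScaler().fit_transform(X)

# Original pairwise Euclidean distances, in condensed form.
pairwise = pdist(X_scaled, metric="euclidean")

methods = ["single", "complete", "average", "weighted", "centroid", "ward"]
cophenetic = {}
for method in methods:
    # Build the hierarchy, then correlate its merge heights with the original distances.
    Z = linkage(X_scaled, method=method, metric="euclidean")
    c, _ = cophenet(Z, pairwise)
    cophenetic[method] = c
    print(f"{method:>8}: cophenetic correlation = {c:.3f}")

best_method = max(cophenetic, key=cophenetic.get)
print("Highest cophenetic correlation:", best_method)
```

A value closer to 1 means the dendrogram preserves the original pairwise distances more faithfully.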

Let's see the dendrogram for Mahalanobis and Manhattan distances with average and weighted linkage methods (as they gave high cophenetic correlation values).
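A sketch of building the four dendrograms with SciPy, again on synthetic data. In SciPy the Manhattan distance is called `cityblock`. The call uses `no_plot=True` so that only the tree layout is computed; this is a choice made here to keep the example independent of a plotting backend, and removing that argument draws the tree with matplotlib.

```python
from itertools import product

from scipy.cluster.hierarchy import cophenet, dendrogram, linkage
from scipy.spatial.distance import pdist
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled stock features.
X, _ = make_blobs(n_samples=60, centers=3, n_features=4, random_state=1)
X_scaled = StandardScaler().fit_transform(X)

results = {}
leaf_counts = {}
for metric, method in product(["mahalanobis", "cityblock"], ["average", "weighted"]):
    # Condensed pairwise distances under the chosen metric.
    distances = pdist(X_scaled, metric=metric)
    Z = linkage(distances, method=method)

    # Cophenetic correlation for this metric and linkage combination.
    c, _ = cophenet(Z, distances)
    results[(metric, method)] = c

    # Compute the dendrogram layout without drawing it.
    tree = dendrogram(Z, no_plot=True)
    leaf_counts[(metric, method)] = len(tree["leaves"])
    print(f"{metric:>11} + {method:<8}: cophenetic correlation = {c:.3f}")
```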

Cluster Profiling

Let's create 10 clusters.

Let's create 8 clusters.

Let's create 7 clusters.
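The three cluster counts tried above can be compared side by side. The sketch below uses scikit-learn's `AgglomerativeClustering` with average linkage and its default Euclidean distance, which is an assumption; the outline does not state which distance and linkage were used for the final model. The data and the feature names `f1` to `f4` are synthetic.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the numeric stock features.
X, _ = make_blobs(n_samples=340, centers=7, n_features=4, random_state=1)
features = pd.DataFrame(X, columns=["f1", "f2", "f3", "f4"])
scaled = StandardScaler().fit_transform(features)

# Fit one model per candidate cluster count and record the cluster sizes.
sizes = {}
for k in (10, 8, 7):
    model = AgglomerativeClustering(n_clusters=k, linkage="average")
    labels = model.fit_predict(scaled)
    sizes[k] = pd.Series(labels).value_counts().sort_index()
    print(f"{k} clusters, sizes: {sizes[k].tolist()}")

# Profile the last fit (7 clusters): mean of each feature per cluster.
features["cluster"] = labels
profile = features.groupby("cluster").mean()
print(profile)
```

Comparing the size lists shows whether a higher cluster count only splits off very small groups, which is one common reason to settle on the smaller number.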

Insights

K-means vs Hierarchical Clustering

Actionable Insights and Recommendations

Dimensionality Reduction using PCA for visualization
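A minimal sketch of reducing the scaled features to two principal components for plotting, using scikit-learn. The data is synthetic, and the choice of two components is an assumption made so the result fits a 2D scatter plot.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled stock features.
X, _ = make_blobs(n_samples=200, centers=4, n_features=6, random_state=1)
X_scaled = StandardScaler().fit_transform(X)

# Project the data onto its first two principal components.
pca = PCA(n_components=2, random_state=1)
components = pca.fit_transform(X_scaled)

# Share of the total variance retained by the two components.
explained = float(pca.explained_variance_ratio_.sum())
print("Shape after PCA:", components.shape)
print(f"Variance explained by 2 components: {explained:.1%}")
```

The two columns of `components` can then be used as the x and y coordinates of a scatter plot colored by cluster label. The explained-variance figure indicates how much structure the 2D view may be hiding.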